1 Introduction

I found this paper very interesting, in fact so interesting that I was motivated to
think carefully about its assumptions and to check some tedious mathematics. I
noticed what looked like a serious error in the paper's proof of its key inequality
(9).² In searching for an alternative to (9), I found a simple, straightforward
proof of this inequality (based on ideas in the paper, which in turn is based on
[1]). This is presented in Sections 3 and 4.

The paper seems fairly clearly written, but since it is not completely explicit 
(e.g., there are symbols whose meaning the reader has to guess), I was worried 
that I might have misinterpreted something. To reduce this possibility, the 
following explains my interpretation of its content in greater detail than usual. 

I thank the authors for their comments and for pointing out a slip, which I 
have corrected. Of course, I take responsibility for any further errors. 

I do assume that the reader is somewhat familiar with the paper and has it at
hand. The notation follows the paper as much as possible. Any undefined symbols
are as in the paper. Page numbers refer to the version
www.arXiv.org/quant-ph/0704.2529v1. I have not seen the published version, but
since the arXiv version is dated April 19, 2007 and the published version
appeared days later, I assume that they are identical, or nearly so.

2 My interpretation of the paper's setup 

For ease of language when introducing the definitions, it will sometimes be
convenient to pretend that the probability distributions arising are discrete. For
example, the paper considers pairs of photons with polarizations u, v, occurring
with probability density F(u, v). I will sometimes refer to F(u, v) as the
probability that photon 1 has polarization u and photon 2 has polarization v, which
would be correct language if F were a discrete probability distribution.

A source emits pairs of photons in different directions, as depicted in Figure 
2 of the paper. One photon goes to Alice, and the other to Bob. 



¹ Current contact information can be found on my web page, www.math.umb.edu/~sp.
² The authors have since sent me a revised proof avoiding the error.

The probability that Alice's photon has polarization u and Bob's has
polarization v is denoted F(u, v). Here u, v represent points on the unit sphere
in three-dimensional space R^3. The standard angular polar coordinates of
a vector like u are denoted θ_u, φ_u, so that

\[ u = (\cos\phi_u \sin\theta_u,\ \sin\phi_u \sin\theta_u,\ \cos\theta_u) . \]

This corresponds to a photon represented quantum-mechanically by the ray
in the two-dimensional complex Hilbert space C^2 represented by the vector

\[ \begin{pmatrix} \cos(\theta_u/2) \\ e^{i\phi_u} \sin(\theta_u/2) \end{pmatrix} . \]

The paper considers a "hidden variable" λ associated with the source.
Presumably, this can be thought of as a classical label attached by the source to
each of the pair of emitted photons. The same label is attached to each of the
photons in an emitted pair, but the label can vary from pair to pair.

My first impression was that the authors were thinking of the source as
emitting two photons with polarizations u, v with an additional label λ attached
to each photon, as in their Appendix I example of an explicit non-local
hidden-variable model. (The set of possible labels λ is allowed to depend on u and v,
as in the example.) However, this seems inconsistent with some of their later
notation, so I eventually settled on the interpretation to be described below.
The two interpretations are essentially equivalent (modulo technicalities), so the
choice of either is a matter of taste and notation.

The nature of the label λ is not specified and is irrelevant to the proofs. It
could be a real number in a certain range (depending on u and v), as in the
Appendix I example, or something more complicated.

We could use a new label λ' defined as a triple λ' := (λ, u, v), where λ is the
"old label" in the viewpoint above. This is conceptually simpler in that there
is now only one label λ' rather than three. In order to stay close to the paper's
notation, from now on we write λ instead of λ' and work with only one label.

The polarization u of the photon received by Alice is assumed to be a function
u = α(λ) of the hidden variable label attached to her photon, and similarly the
polarization of Bob's photon is v = β(λ). The functions α(·), β(·) (which are
not part of the paper's notation) are introduced for later convenience instead of
writing u(λ), v(λ); certain distinctions are hard to make in the latter notation.

This could give a classical explanation for correlations between the polar- 
izations of Alice's and Bob's photons. The paper's aim is to show that such 
a classical explanation of observed correlations contradicts both quantum me- 
chanics and experiment. 

The set of possible labels is a probability space, whose probability measure
will not be named. Since Alice's polarization is a function u = α(λ) of the
hidden variable λ (and Bob's is v = β(λ)), this induces a probability
distribution F(u, v) on the set of polarization pairs u, v as follows. When the
set of λ is discrete, the probability F(u, v) of a particular polarization pair
u, v is the probability of the set of all λ such that α(λ) = u and β(λ) = v.

When λ is a continuous variable, the mathematical object corresponding to
F(u, v) is a probability measure which might be denoted F(u, v) du dv in the
special case in which it is given by a probability density function, where du and
dv represent Lebesgue measure on the unit sphere. We follow the paper by using
the notation of a probability density function, with the understanding that the
measure might have a singular part (e.g., concentrated at a point or on a line).
A precise mathematical definition might be cumbersome, but the discrete case
above gives the idea.

The paper defines "Malus' law" as "the well-known cosine dependence of
the intensity of a polarized beam after an ideal polarizer". I take this to mean
the following. Alice has an instrument to measure polarization in any chosen
direction a. The only possible results of the measurement are ±1. A reading of
+1 means that the observed polarization was in the direction a, and −1 means
that it was in the opposite direction −a. If she receives many photons with
polarization u, then the average reading is a · u (which is the cosine of the angle
between a and u).

The paper introduces a symbol ρ_{u,v}, giving only the cryptic explanation:
"Each emitted pair is fully defined by the subensemble distribution ρ_{u,v}(λ)." I
take this to mean that ρ_{u,v}(·) is a conditional probability density function: in
the discrete case, ρ_{u,v}(λ) is the probability of λ given that the polarizations of
the emitted pair were u, v. A precise mathematical definition in the generality
considered by the paper might be cumbersome, but the idea is clear in the
discrete case: given a particular u, v and λ_0 with α(λ_0) = u and β(λ_0) = v,
ρ_{u,v}(λ_0) is defined as the probability of λ_0 divided by the probability of the set
of all λ such that α(λ) = u and β(λ) = v.
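
The discrete case can be made concrete with a small sketch. The label names, their
probabilities, and the maps `alpha` and `beta` below are invented purely for
illustration (none of them come from the paper). The code computes F(u, v) as the
probability of the set of labels mapped to (u, v), and the conditional probability
of a label given its polarization pair, as described above.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical discrete label space: label -> probability (sums to 1).
prob = {
    "l1": Fraction(1, 4),
    "l2": Fraction(1, 4),
    "l3": Fraction(1, 2),
}

# Hypothetical polarization maps alpha, beta (unit vectors as tuples).
alpha = {"l1": (0, 0, 1), "l2": (0, 0, 1), "l3": (1, 0, 0)}
beta = {"l1": (0, 0, -1), "l2": (0, 0, -1), "l3": (-1, 0, 0)}


def induced_distribution(prob, alpha, beta):
    """F(u, v): total probability of all labels with alpha = u and beta = v."""
    F = defaultdict(Fraction)
    for label, p in prob.items():
        F[(alpha[label], beta[label])] += p
    return dict(F)


def conditional_density(prob, alpha, beta, label):
    """P(label) divided by P(all labels sharing the same pair (u, v))."""
    F = induced_distribution(prob, alpha, beta)
    return prob[label] / F[(alpha[label], beta[label])]


F = induced_distribution(prob, alpha, beta)
print(F[((0, 0, 1), (0, 0, -1))])                    # 1/2
print(conditional_density(prob, alpha, beta, "l1"))  # 1/2
print(conditional_density(prob, alpha, beta, "l3"))  # 1
```

Exact fractions are used so that the conditional probabilities for each pair
(u, v) visibly sum to 1.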

Suppose Alice sets her instrument to measure polarization in the a direction,
Bob sets his to measure in the b direction, and the hidden variable attached to
each of their photons is λ. The paper denotes the outcome of Alice's measurement
(either +1 or −1) as A(a, b, λ) and Bob's as B(a, b, λ). The assumption
that Malus' law holds is then given by the paper's equations (1) and (2):

\[ \bar{A}(u) := \int d\lambda\, \rho_{u,v}(\lambda)\, A(a, b, \lambda) = u \cdot a , \tag{1} \]

\[ \bar{B}(v) := \int d\lambda\, \rho_{u,v}(\lambda)\, B(a, b, \lambda) = v \cdot b . \tag{2} \]

(I changed the paper's first "=" to the definition symbol ":=" because I think 
it is helpful to the reader to explicitly distinguish between equality by definition 
and assertions of equality between separately defined quantities.) 

These equations seem sensible in terms of the interpretation just described, in
which the source emits two particles, each with just one label (the same label)
λ, which implicitly contains the polarization information. If one is thinking
of emission of two polarizations u, v along with an additional label λ, then
in equation (1), A(a, b, λ) should be written A(a, b, u, v, λ) (or, less generally,
A(a, b, u, λ)). In more physical language, what Alice measures is expected to
depend explicitly on the polarization of the photon she receives. Indeed, the
Appendix I example writes A = A(a, b, u, λ).

The interpretation above (with just one label λ which contains the polarization
information) was developed to make sense of equations (1) and (2). But
the two interpretations are equivalent, modulo technicalities and notation.

3 Why the hidden variable theory cannot reproduce quantum mechanics



We are interested in the following two questions. 

1. Can the hidden variable theory described in the previous section reproduce 
the results of quantum mechanics? 

2. If not, how can we experimentally distinguish between quantum mechanics 
and the hidden variable theory? 

This section presents a simple proof that the hidden variable theory cannot 
reproduce the results of quantum mechanics. This conclusion will also follow 
from the results of the next section, which answers question 2, but we present 
it separately because it is a little easier and the result is simpler than the
paper's (9). The proof of the next section is not much longer than the proof 
of this section, but it seems less motivated. The present section provides the 
motivation, notational preliminaries, and a few simple calculations which enter 
into the proof. 

Before starting, I should acknowledge that the proof's ideas are mostly con- 
tained in the paper under discussion, which is based on [1]. Although in ret- 
rospect, the proof seems simple, I think it would have taken me a long time 
to find it had I been given the problem without the solution hints contained in 
these two references. Any mathematician knows that the first proof is always 
the hardest to construct, and in retrospect is often unnecessarily complicated. 

For given vectors a, b, define a "correlation function" C(a, b) by

\[ C(a, b) := \int du\, dv\, d\lambda\, \rho_{u,v}(\lambda)\, F(u, v)\, A(a, b, \lambda)\, B(a, b, \lambda) . \tag{3} \]

Here ρ_{u,v}(λ), F(u, v), A(a, b, λ), and B(a, b, λ) are as defined in the paper and
in Section 2 above, and ∫ du represents the integral over the unit sphere
in three-dimensional real Euclidean space (similarly for ∫ dv).

The correlation C(a, b) is called ⟨AB⟩ in the paper (its equation (4)); we
introduce the new notation because we shall need to display the dependence of
⟨AB⟩ on the "setting vectors" a and b.

Let α := cos⁻¹(a · b) be the angle between a and b. For a system in the
singlet state (the case considered by the paper), quantum mechanics predicts
that C(a, b) = −a · b. In the following, it will be helpful to think of α as an
acute angle (though the proof does not assume this), so that it is expected that
C(a, b) < 0. For this case, it is a little easier to work with −C(a, b) > 0.

The paper (following [1]) shows that:

\[ -1 + \int du\, dv\, F(u, v)\, |a \cdot u - b \cdot v| \;\le\; -C(a, b) \;\le\; 1 - \int du\, dv\, F(u, v)\, |a \cdot u + b \cdot v| . \tag{4} \]
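
The mechanism behind (4) can be illustrated with a small numerical sketch (an
illustration only, not the paper's derivation). For outcomes A, B = ±1 the
pointwise identities −1 + |A + B| = AB = 1 − |A − B| hold, and the absolute
value of an average never exceeds the average of the absolute values. Applied to
the averages in (1) and (2) and written for −C instead of C, this is the shape of
(4). The sample size and the random outcomes below are arbitrary assumptions.

```python
import itertools
import random

# Pointwise identities for outcomes A, B in {+1, -1}.
for A, B in itertools.product((1, -1), repeat=2):
    assert -1 + abs(A + B) == A * B == 1 - abs(A - B)


def averaged_bounds(pairs):
    """Return (lower, mean_AB, upper) for a list of (A, B) outcome pairs."""
    n = len(pairs)
    mean_A = sum(A for A, _ in pairs) / n
    mean_B = sum(B for _, B in pairs) / n
    mean_AB = sum(A * B for A, B in pairs) / n
    return -1 + abs(mean_A + mean_B), mean_AB, 1 - abs(mean_A - mean_B)


rng = random.Random(0)  # fixed seed so the run is reproducible
pairs = [(rng.choice((1, -1)), rng.choice((1, -1))) for _ in range(1000)]
lower, mean_AB, upper = averaged_bounds(pairs)
print(lower <= mean_AB <= upper)  # True
```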

Only the right-hand inequality will be used below, which will essentially result
in establishing half of the paper's inequality (9). The other half follows similarly
from the left inequality in (4), as will be indicated in the next section.

According to quantum mechanics, for all a,

\[ 1 = -C(a, a) \le 1 - \int du\, dv\, F(u, v)\, |a \cdot (u + v)| , \tag{5} \]

so the integral on the right must vanish. Since the integrand is non-negative,
this implies that F(u, v) must be concentrated on the singular set of all u, v
such that v = −u. Restricting to this set, the probability distribution can
be symbolically represented by a probability density function of just one sphere
variable u. We denote this new probability density function as F_s(u) and rewrite
the right-hand inequality in (4) as:

\[ -C(a, b) \le 1 - \int du\, F_s(u)\, |(a - b) \cdot u| . \tag{6} \]



Suppose temporarily that the unit vectors satisfy b ≠ ±a, so that a and b are
contained in a unique plane. Following the paper and [1], we obtain more tractable
inequalities by averaging C(a, b) over rotations in the plane determined by a, b
(i.e., rotations about the a × b axis). The result, which depends only on the
plane of rotation and the angle α := cos⁻¹(a · b), will be denoted E(α). More
explicitly, if R(σ) denotes a rotation through the angle σ about the axis a × b,
then

\[ E(\alpha) := \frac{1}{2\pi} \int_0^{2\pi} d\sigma\, C(R(\sigma) a, R(\sigma) b) . \tag{7} \]

In this notation, E(α) implicitly depends on the plane of a and b. When we
want to include in the notation that this plane is the x-y plane, we write E_xy(α)
instead of E(α), and similarly E_xz(α) denotes E(α) when a and b lie in the x-z
plane.

Next we derive (following the paper and [1]) an inequality for E_xy(α). For
any vector u = (u_x, u_y, u_z) on the unit sphere, write u_xy := (u_x, u_y, 0) to denote
the projection of u to the x-y plane. Then for any vector q in the x-y plane,
q · u = q · u_xy = |q||u_xy| cos β, where β is the angle between q and u_xy. Hence
for a, b in the x-y plane, the average of |(a − b) · u| over rotations in that plane
is

\[ \frac{1}{2\pi} \int_0^{2\pi} d\sigma\, |(R(\sigma)(a - b)) \cdot u|
   = |a - b|\, |u_{xy}|\, \frac{1}{2\pi} \int_0^{2\pi} d\tau\, |\cos\tau|
   = \frac{2}{\pi}\, |a - b|\, |u_{xy}| , \tag{8} \]

where the integration variable was changed from σ to τ := β − σ, with β the
angle between u_xy and a − b. Combining this with inequality (6) gives

\begin{align}
-E_{xy}(\alpha) &\le 1 - \frac{2}{\pi}\, |a - b| \int du\, F_s(u)\, |u_{xy}| \tag{9} \\
 &= 1 - \frac{4}{\pi} \left| \sin\frac{\alpha}{2} \right| \int du\, F_s(u)\, |u_{xy}| , \tag{10}
\end{align}

where the last line follows from the routine calculation

\[ |a - b|^2 = 2 - 2\, a \cdot b = 2(1 - \cos\alpha) = 4 \sin^2\frac{\alpha}{2} . \]
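
The rotation average in (8) and the identity |a − b| = 2|sin(α/2)| are easy to
check numerically. The following is a minimal sketch; the angle, the vector u,
and the number of quadrature steps are arbitrary choices made for illustration.

```python
import math


def rotate_xy(q, sigma):
    """Rotate a 3-vector q by angle sigma about the z axis."""
    x, y, z = q
    c, s = math.cos(sigma), math.sin(sigma)
    return (c * x - s * y, s * x + c * y, z)


def dot(p, q):
    return sum(pi * qi for pi, qi in zip(p, q))


def rotation_average(q, u, steps=20000):
    """Midpoint-rule estimate of (1/2pi) * integral of |R(sigma) q . u| d sigma."""
    h = 2 * math.pi / steps
    total = sum(abs(dot(rotate_xy(q, (k + 0.5) * h), u)) for k in range(steps))
    return total * h / (2 * math.pi)


angle = 0.7                                   # arbitrary angle between a and b
a = (1.0, 0.0, 0.0)
b = (math.cos(angle), math.sin(angle), 0.0)
u = (0.3, -0.5, math.sqrt(1 - 0.3**2 - 0.5**2))  # arbitrary unit vector

q = tuple(ai - bi for ai, bi in zip(a, b))    # q = a - b lies in the x-y plane
norm_q = math.sqrt(dot(q, q))
norm_uxy = math.hypot(u[0], u[1])

lhs = rotation_average(q, u)                  # left side of (8)
rhs = (2 / math.pi) * norm_q * norm_uxy       # right side of (8)
print(lhs, rhs)
print(norm_q, 2 * abs(math.sin(angle / 2)))
```

The two numbers printed on each line should agree to within the quadrature
error.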

It is hard to deduce more from inequality (10) without specific knowledge of
the probability density F_s(u). But adding the x-y and x-z versions of (10) gives
something useful:

\[ -E_{xy}(\alpha) - E_{xz}(\alpha) \le 2 - \frac{4}{\pi} \left| \sin\frac{\alpha}{2} \right| . \tag{11} \]

Here we have used the facts that ∫ F_s(u) du = 1 and that |u_xy| + |u_xz| ≥ 1.
(Proof: |u_xy| + |u_xz| ≥ |u_xy|² + |u_xz|² = u_x² + u_y² + u_x² + u_z² ≥ 1.)
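
As a quick sanity check of the fact |u_xy| + |u_xz| ≥ 1, the following sketch
samples random unit vectors and records the smallest value of the sum. The
sampling method, seed, and sample size are arbitrary illustrative choices.

```python
import math
import random


def random_unit_vector(rng):
    """Uniform random point on the unit sphere (normalized Gaussian triple)."""
    while True:
        x, y, z = (rng.gauss(0, 1) for _ in range(3))
        r = math.sqrt(x * x + y * y + z * z)
        if r > 1e-12:
            return (x / r, y / r, z / r)


def projection_sum(u):
    """|u_xy| + |u_xz| for a 3-vector u."""
    ux, uy, uz = u
    return math.hypot(ux, uy) + math.hypot(ux, uz)


rng = random.Random(1)
smallest = min(projection_sum(random_unit_vector(rng)) for _ in range(100000))
print(smallest >= 1.0 - 1e-12)          # True
print(projection_sum((0.0, 1.0, 0.0)))  # 1.0 (equality on the y axis)
```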

The argument just given assumed that C(a, a) = −1, which implies that
F(u, v) is concentrated on v = −u. If F(u, v) is not concentrated on v = −u,
then E_xy(0) gives some information about F(u, v) for v ≠ −u. This suggests that
it might be productive to look at

\[ -E_{xy}(\alpha) - E_{xy}(0) , \]

as the paper does.

4 Testing the hidden-variable theory 

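Finally, we give a proof of the paper's (9) without assuming that C(a, a) = −1.
We use the notation of the last section, along with some simple facts established
there.

Apply the right-hand inequality in (4) twice, once to C(a, b) and once to
C(a, a), to obtain

\begin{align}
-C(a, b) - C(a, a) &\le 2 - \int du\, dv\, F(u, v)\, [\, |a \cdot u + b \cdot v| + |a \cdot u + a \cdot v| \,] \notag \\
 &= 2 - \int du\, dv\, F(u, v)\, [\, |a \cdot u + b \cdot v| + |-a \cdot u - a \cdot v| \,] \notag \\
 &\le 2 - \int du\, dv\, F(u, v)\, |(b - a) \cdot v| , \tag{12}
\end{align}

where the last line comes from the triangle inequality, |p| + |q| ≥ |p + q|.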

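Let α := cos⁻¹(a · b) be the angle between a and b. Average over rotations in
the x-y plane to obtain

\begin{align}
-E_{xy}(\alpha) - E_{xy}(0) &\le 2 - \frac{2}{\pi}\, |b - a| \int du\, dv\, F(u, v)\, |v_{xy}| \notag \\
 &= 2 - \frac{4}{\pi} \left| \sin\frac{\alpha}{2} \right| \int du\, dv\, F(u, v)\, |v_{xy}| . \tag{13}
\end{align}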

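The same procedure using the left inequality in (4) yields

\begin{align}
C(a, b) + C(a, a) &\le 2 - \int du\, dv\, F(u, v)\, [\, |a \cdot u - b \cdot v| + |a \cdot u - a \cdot v| \,] \notag \\
 &\le 2 - \int du\, dv\, F(u, v)\, |(b - a) \cdot v| , \notag
\end{align}

so

\[ E_{xy}(\alpha) + E_{xy}(0) \le 2 - \frac{4}{\pi} \left| \sin\frac{\alpha}{2} \right| \int du\, dv\, F(u, v)\, |v_{xy}| . \]

Combining this with (13) gives

\[ |E_{xy}(\alpha) + E_{xy}(0)| \le 2 - \frac{4}{\pi} \left| \sin\frac{\alpha}{2} \right| \int du\, dv\, F(u, v)\, |v_{xy}| . \tag{14} \]

Do the same for the x-z plane and add the results, recalling from the last section
that |v_xy| + |v_xz| ≥ 1, to obtain the paper's (9):

\[ |E_{xy}(\alpha) + E_{xy}(0)| + |E_{xz}(\alpha) + E_{xz}(0)| \le 4 - \frac{4}{\pi} \left| \sin\frac{\alpha}{2} \right| \]

for the particular choice of orthogonal planes x-y and x-z.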

Of course, the proof just given applies to any two orthogonal planes — the 
particular choice of planes was made to simplify the notation. The paper's state- 
ment of its (9) appears to apply to any two planes, not necessarily orthogonal. 
However, its proof does explicitly assume orthogonal planes (on the top of its 
page 13), so I assume this was intended. 
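
For orientation, the inequality can be compared numerically with the singlet
prediction C(a, b) = −a · b quoted in Section 3. The sketch below assumes that
this prediction holds in both planes, so that E_xy(α) = E_xz(α) = −cos α. The
closed-form angles in the code follow from elementary algebra under that
assumption and are not taken from the paper.

```python
import math


def qm_lhs(alpha):
    """Left side of the paper's (9), assuming E(alpha) = -cos(alpha) in both planes."""
    def e(x):
        return -math.cos(x)
    return 2 * abs(e(alpha) + e(0.0))


def bound(alpha):
    """Right side of the paper's (9): 4 - (4/pi)|sin(alpha/2)|."""
    return 4 - (4 / math.pi) * abs(math.sin(alpha / 2))


def violation(alpha):
    """Positive when the assumed quantum prediction exceeds the bound."""
    return qm_lhs(alpha) - bound(alpha)


# The violation equals (4/pi) s - 4 s^2 with s = sin(alpha/2), for 0 <= alpha <= pi.
alpha_zero = 2 * math.asin(1 / math.pi)        # violation vanishes here
alpha_max = 2 * math.asin(1 / (2 * math.pi))   # violation is largest here

print(round(math.degrees(alpha_zero), 1))  # 37.1
print(round(math.degrees(alpha_max), 1))   # 18.3
print(round(violation(alpha_max), 4))      # 0.1013
```

Under this assumption the inequality is violated only for fairly small nonzero
angles, and the largest violation is 1/π².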






5 Statistical methods 

The paper does not completely explain its statistical methods, and I'm not 
sure I can agree with what is explained. I have questions about the standard 
deviations claimed. The paper states that "the errors [presumably meaning 
standard deviations] are calculated assuming that the counts follow a poissonian 
distribution" . I don't understand this assumption. I'm not sure precisely what 
it means, and under all interpretations which have occurred to me, it seems 
questionable. 

If we were measuring the number of counts observed by Alice in a given
time interval (say the 10 sec. mentioned on p. 5, during which Alice observes
about 95,000 counts), that would be expected to follow a Poisson distribution³
p(k) = (μ^k e^{−μ})/k!, where p(k) is the probability of exactly k counts and μ is
the mean of the distribution. Also, if we were measuring the number of times
that Alice and Bob "simultaneously" observe a photon in that 10 seconds, that
would be expected to follow a Poisson distribution (with a different mean).
Here "simultaneously" means that Alice and Bob both observe photons at times
differing by less than some preassigned constant δ > 0; e.g., they both observe
a photon at times differing by less than 1 microsecond. But these are not what
we are measuring.

What we are measuring is the following. First we select all the occasions on
which Alice and Bob receive a photon "simultaneously" (as defined in the last
paragraph). Then for each such occasion, we observe the value of a "yes-no"
random variable which takes the value "yes" if and only if (Alice observes spin
+1 (relative to her instrument set at a) and Bob observes spin +1 (relative to his
instrument set at b)) or (Alice observes spin −1 and Bob also observes −1). Then
we calculate the relative frequency of "yes" answers (the number of occurrences
of "yes" divided by the total number of simultaneous pairs), a statistic S called
the "sample mean" (to distinguish it from the usually unknown mean of the
probability distribution from which the random sample is drawn). The sample
mean S estimates the probability (call it q) of "yes". Routine calculation reveals
that when n simultaneous pairs are observed, the sample mean has standard
deviation √(q(1 − q))/√n. Hence it seems reasonable to estimate the standard
deviation of the sample mean by⁴

\[ \frac{\sqrt{S(1 - S)}}{\sqrt{n}} . \]

³ The Poisson distribution was invented to describe the number of random events
expected to occur in a given time interval. One of the first uses of it was to describe the number
of Prussian cavalry who would be kicked to death by horses in a given year! The actual
numbers matched the distribution very closely.

From this, an estimate for the correlations C := C(a, b) = E(a, b) follows
easily.⁵ Suppose that we observe n photon pairs with n₊ "yes" results and n₋
"no", n₊ + n₋ = n. Then the sample mean is S = n₊/n, and the measured
correlation is C = n₊/n − n₋/n = (2n₊ − n)/n = 2S − 1. Hence the estimated standard
deviation of E = C is twice the estimated standard deviation √(S(1 − S)/n) for
S.
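
The calculation just described can be sketched in a few lines. The function name
and the counts used in the example are hypothetical, chosen only to illustrate
the formulas S = n₊/n and C = 2S − 1.

```python
import math


def correlation_with_error(n_yes, n_no):
    """Return (S, se_S, C, se_C) from counts of "yes" and "no" results."""
    n = n_yes + n_no
    S = n_yes / n                       # sample mean of the yes/no variable
    se_S = math.sqrt(S * (1 - S) / n)   # estimated standard deviation of S
    C = 2 * S - 1                       # measured correlation
    se_C = 2 * se_S                     # C = 2S - 1, so its error is doubled
    return S, se_S, C, se_C


# Hypothetical counts, chosen only for illustration.
S, se_S, C, se_C = correlation_with_error(n_yes=25, n_no=75)
print(S, C)            # 0.25 -0.5
print(round(se_S, 4))  # 0.0433
print(round(se_C, 4))  # 0.0866
```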

We can't apply this directly to the results of the paper because the value
of n (number of photon pairs used to calculate the sample mean) is not given.
However, we can ask what value of n would yield the paper's claimed error
of .0118 for E(a_2, b_3) = −.9902 ± .0118 (bottom of p. 6). The claimed error
[standard deviation] of .0118 for C = E := E(a_2, b_3) corresponds to a standard
deviation of .0059 for S, so we need to solve the equation

\[ \frac{\sqrt{S(1 - S)}}{\sqrt{n}} = .0059 \]

with S = (C + 1)/2 = (E + 1)/2 = .0049.

The solution is n ≈ 140, which seems rather small. The paper mentions
approximately 3000 photon pairs received in 10 sec. If this were the true value
of n, then the claimed error of .0118 for E(a_2, b_3), which scales with 1/√n,
would be about 5 times smaller. I wonder if the paper may have inadvertently
overstated the errors.
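
The arithmetic behind these figures can be reproduced as follows. This is a
minimal sketch using only the numbers quoted above (E = −.9902, claimed error
.0118, and about 3000 pairs); the function names are hypothetical.

```python
import math


def implied_n(E, sd_E):
    """Number of pairs n solving sqrt(S(1-S)/n) = sd_E / 2 with S = (E + 1)/2."""
    S = (E + 1) / 2
    sd_S = sd_E / 2
    return S * (1 - S) / sd_S**2


def sd_of_E(E, n):
    """Estimated standard deviation of E = 2S - 1 for n observed pairs."""
    S = (E + 1) / 2
    return 2 * math.sqrt(S * (1 - S) / n)


n = implied_n(E=-0.9902, sd_E=0.0118)
print(round(n))                                   # 140
print(round(sd_of_E(-0.9902, 3000), 4))           # 0.0025
print(round(0.0118 / sd_of_E(-0.9902, 3000), 1))  # 4.6
```

With n = 3000 the estimated error would be about .0025, roughly a factor of 4.6
below the claimed .0118.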

⁴ All of this is standard statistics. For simplicity, I am glossing over some statistical
subtleties which are unimportant in the present context. For example, calculation reveals that
the estimator S(1 − S)/n of the variance of the sample mean is (surprisingly) not "unbiased";
to get an unbiased estimator one replaces S(1 − S)/n by S(1 − S)/(n − 1). For large n, the
difference is negligible. It is usual to estimate the standard deviation of the sample mean as
the square root of the estimator for the variance even though this estimator is not unbiased
with either estimator of the variance.

⁵ I am following the paper in assuming that C(a, b) = E(a, b), where E(a, b) denotes the
average of C(a, b) over the plane of a, b. The next section wonders about this assumption.

6 Final comments



As a mathematician who is largely self-taught in physics, I am unsure of the
correspondence between the physical measurements described in the paper and
the mathematics of the Poincaré sphere. Is this well-established physics, or is
it a kind of guess, based on mathematical analogies between complex polarization
vectors in classical electrodynamics and the two-dimensional complex state
space describing quantum-mechanical photons?

I am uneasy about the paper's justification for its assumption that the average
over a great circle on the Poincaré sphere can be confidently replaced by
an evaluation of the single correlation C(a, b) for a, b on the circle. The paper
justifies this assumption as follows (bottom of p. 5):

    "So far, no experimental evidence against the rotational invariance
    of the singlet state exists. We therefore replace the rotation averaged
    correlation functions in inequality (9) with their values measured for
    one pair of settings (in the given plane)."

It seems dangerous to assume that something is true on the sole grounds that 
no one has proved it false. That risks overlooking potentially important new 
physics. 

My impression is that C(a, b) = −a · b is experimentally well established for
correlations C(a, b) with a and b in the x-z plane, i.e., linear polarizations. I'm
not aware of any experiments explicitly validating it for a, b lying in some other
plane. Are there any? If so, it would be helpful if the paper gave references.

The results of the paper suggest its confirmation for the y-z plane in that
correlations in the y-z plane are used in calculating S_NLHV on the left side of
inequality (9), and the measured values of S_NLHV are consistent with quantum
mechanics. However, the actual measured correlations C(a, b) are not given in
the paper, except for a few special cases at the bottom of p. 6.

Enough data to suggestively confirm C(a, b) = −a · b for the y-z plane was
probably gathered in the course of the experiment. It would have been helpful
had it been presented, if not in the Nature article (which might have had length
constraints), then in an arXiv report. These experiments are probably hard to
do, and print is cheap.